CLEF 2017 Microblog Cultural Contextualization Content Analysis task Overview

Authors

  • Liana Ermakova
  • Josiane Mothe
  • Eric SanJuan
Abstract

The MC2 CLEF 2017 Content Analysis task deals with classification, filtering, language recognition, localization, entity extraction, linking open data, and summarization. Festivals have a large presence on social media. The resulting microblog stream and related URLs are well suited to experimenting with advanced social media search and mining methods. For content analysis, topics could be in any language, and results were expected in four languages: English, Spanish, French, and Portuguese.

Similar resources

Tweet Data mining : the Cultural Microblog Contextualization Data Set

This paper presents an overview of the data set that was used for the Cultural Microblog Contextualization Workshop at CLEF 2016 and more specifically for the task 1: tweet contextualization. In this paper we first present a descriptive analysis of the data: we consider the variables or features associated with the tweets and analyse them. Then we also analyse the tweet textual content. The res...

Microblog Contextualization using Continuous Space Vectors: Multi-Sentence Compression of Cultural Documents

In this paper we describe our work for the MC2 CLEF 2017 lab. We participated in the content analysis task that involves filtering, language recognition and summarization. We combine Information Retrieval with Multi-Sentence Compression methods to contextualize microblogs using Wikipedia’s pages.

LIG at CLEF 2016 Cultural Microblog Contextualization: TimeLine illustration based on Microblogs

This paper presents the approach used by the LIG-MRIM research group for its participation in task 3 (TimeLine illustration based on Microblogs) of the CLEF Cultural Microblog Contextualization track. This task deals with the retrieval of tweets related to cultural events (music festivals). For the content-based elements, we use the classical BM25 model [4]. Then, we diversify the resul...

Entity Recognition and Language Identification with FELTS

These working notes describe the experiments we conducted in the Microblog Cultural Contextualization Lab [2] of CLEF 2017 [3]. The microblog data is composed of very short texts, with very heterogeneous styles. Some of them are written in more than one language. We decided to tackle the entity recognition problem by using a non-statistical, dictionary-based, multiword term extractor. On the othe...

A Tweets Classifier based on Cosine Similarity

The 2017 Microblog Cultural Contextualization task consists of three challenges: (1) Content Analysis, (2) Microblog search, and (3) TimeLine illustration. This paper describes the use of cosine similarity, which is characterized by the comparison of similarity between two vectors of an inner product space. This research used two approaches: (1) word2vec and (2) Bag-of-Words (BoW) for extractin...

Publication date: 2017